Smart Paper Notes

Pushing the Limits of Learning-based Traversability Analysis for Autonomous Driving on CPU

Daniel Fusaro , Emilio Olivastri , Daniele Evangelista , Marco Imperoli , Emanuele Menegatti , Alberto Pretto

Categories: Robotics | Computer Vision

2022-06-07

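Self-driving vehicles and autonomous ground robots require a reliable and accurate method to analyze the traversability of the surrounding environment for safe navigation. This paper proposes and evaluates a machine-learning-based traversability analysis method that combines geometric features with appearance-based features in a hybrid approach based on an SVM classifier. In particular, we show that integrating a new set of geometric and visual features and focusing on important implementation details yields a noticeable boost in performance and reliability. The proposed approach has been compared with state-of-the-art deep learning approaches. It reaches an accuracy of 89.2% in scenarios of varying complexity, demonstrating its effectiveness and robustness. The method runs entirely on CPU, reaches results comparable to those of the other methods, operates faster, and requires fewer hardware resources.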

translated by Google Translate

Toward Human-AI Co-creation to Accelerate Material Discovery

Dmitry Zubarev , Carlos Raoni Mendes , Emilio Vital Brazil , Renato Cerqueira , Kristin Schmidt , Vinicius Segura , Juliana Jansen Ferreira , Dan Sanders

Categories: Machine Learning | Artificial Intelligence

2022-11-05

There is an increasing need in our society to achieve faster advances in science to tackle urgent problems, such as climate change, environmental hazards, sustainable energy systems, and pandemics, among others. In certain domains like chemistry, scientific discovery carries the extra burden of assessing risks of the proposed novel solutions before moving to the experimental stage. Despite several recent advances in Machine Learning and AI to address some of these challenges, there is still a gap in technologies to support end-to-end discovery applications, integrating the myriad of available technologies into a coherent, orchestrated, yet flexible discovery process. Such applications need to handle complex knowledge management at scale, enabling knowledge consumption and production in a timely and efficient way for subject matter experts (SMEs). Furthermore, the discovery of novel functional materials strongly relies on the development of exploration strategies in the chemical space. For instance, generative models have gained attention within the scientific community due to their ability to generate enormous volumes of novel molecules across material domains. These models exhibit extreme creativity that often translates into low viability of the generated candidates. In this work, we propose a workbench framework that aims at enabling human-AI co-creation to reduce the time until the first discovery and the opportunity costs involved. This framework relies on a knowledge base with domain and process knowledge, and user-interaction components to acquire knowledge and advise the SMEs. Currently, the framework supports four main activities: generative modeling, dataset triage, molecule adjudication, and risk assessment.


Characterizing and Detecting State-Sponsored Troll Activity on Social Media

Fatima Ezzeddine , Luca Luceri , Omran Ayoub , Ihab Sbeity , Gianluca Nogara , Emilio Ferrara , Silvia Giordano

Categories: Machine Learning

2022-10-17

The detection of state-sponsored trolls acting in information operations is an unsolved and critical challenge for the research community, with repercussions that go beyond the online realm. In this paper, we propose a novel AI-based solution for the detection of state-sponsored troll accounts, which consists of two steps. The first step aims at classifying trajectories of accounts' online activities as belonging to either a state-sponsored troll or to an organic user account. In the second step, we exploit the classified trajectories to compute a metric, namely "troll score", which allows us to quantify the extent to which an account behaves like a state-sponsored troll. As a case study, we consider the troll accounts involved in the Russian interference campaign during the 2016 US Presidential election, identified as Russian trolls by the US Congress. Experimental results show that our approach identifies accounts' trajectories with an AUC close to 99% and, accordingly, classifies Russian trolls and organic users with an AUC of 97%. Finally, we evaluate whether the proposed solution can be generalized to different contexts (e.g., discussions about Covid-19) and generic misbehaving users, showing promising results that will be further expanded in our future endeavors.


Lower Bounds on the Worst-Case Complexity of Efficient Global Optimization

Wenjie Xu , Yuning Jiang , Emilio T. Maddalena , Colin N. Jones

Categories: Machine Learning

2022-09-20

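Efficient global optimization is a widely used method for optimizing expensive black-box functions, for example when tuning parameters or designing new materials. Despite its popularity, little attention has been paid to analyzing the inherent hardness of the problem, although, given its extensive use, it is important to understand the fundamental limits of efficient global optimization algorithms. In this paper, we study the worst-case complexity of the efficient global optimization problem and, in contrast to existing kernel-specific results, we derive a unified lower bound for the complexity of efficient global optimization in terms of the metric entropy of a ball in the corresponding reproducing kernel Hilbert space (RKHS). Specifically, we show that if there exists a deterministic algorithm that achieves a suboptimality gap smaller than $\epsilon$ for any function $f \in S$ in $T$ function evaluations, then $T$ must be at least $\Omega\left(\frac{\log \mathcal{N}(S(\mathcal{X}), 4\epsilon, \|\cdot\|_\infty)}{\log\left(\frac{R}{\epsilon}\right)}\right)$, where $\mathcal{N}(\cdot, \cdot, \cdot)$ is the covering number, $S$ is the ball centered at $0$ with radius $R$ in the RKHS, and $S(\mathcal{X})$ is the restriction of $S$ to the feasible set $\mathcal{X}$. Moreover, we show that this lower bound nearly matches the upper bound attained by non-adaptive search algorithms for the commonly used squared exponential kernel and the Matérn kernel with a large smoothness parameter $\nu$, up to a replacement of $d/2$ by $d$ and a logarithmic term $\log \frac{R}{\epsilon}$. That is, our lower bound is nearly optimal for these kernels.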


Joint Debiased Representation and Image Clustering Learning with Self-Supervision

Shunjie-Fabian Zheng , JaeEun Nam , Emilio Dorigatti , Bernd Bischl , Shekoofeh Azizi , Mina Rezaei

Categories: Computer Vision | Machine Learning

2022-09-14

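Contrastive learning is among the most successful methods for visual representation learning, and its performance can be further improved by jointly performing clustering on the learned representations. However, existing methods for joint clustering and contrastive learning perform poorly on long-tailed data distributions, because majority classes overwhelm the loss of minority classes and thus prevent meaningful representations from being learned. Motivated by this, we develop a novel joint clustering and contrastive learning framework by adapting the debiased contrastive loss to avoid under-clustering the minority classes of imbalanced datasets. We show that our proposed modified debiased contrastive loss and divergence clustering loss improve performance across multiple datasets and learning tasks. The source code is available at https://anonymon.4open.science/r/ssl-debiased-clustering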
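To make the term "debiased contrastive loss" concrete, here is a minimal sketch of the commonly cited single-anchor form of that loss, in which the negative term is corrected for samples that may secretly share the anchor's class. It is not the modified loss proposed in this paper. The function names, the class prior `tau_plus`, the temperature, and the similarity values are all illustrative assumptions.

```python
import math

def debiased_contrastive_loss(pos_sim, neg_sims, tau_plus, temperature):
    """Debiased contrastive loss for one anchor (illustrative sketch).

    pos_sim:   cosine similarity between the anchor and its positive view
    neg_sims:  cosine similarities between the anchor and sampled "negatives"
    tau_plus:  assumed probability that a sampled negative shares the anchor's class
    """
    n = len(neg_sims)
    pos = math.exp(pos_sim / temperature)
    neg_mean = sum(math.exp(s / temperature) for s in neg_sims) / n
    # Subtract the expected contribution of false negatives from the negative term.
    g = (neg_mean - tau_plus * pos) / (1.0 - tau_plus)
    # Clamp at the theoretical minimum exp(-1/t), reached when similarity is -1.
    g = max(g, math.exp(-1.0 / temperature))
    return -math.log(pos / (pos + n * g))

def info_nce_loss(pos_sim, neg_sims, temperature):
    """Standard (biased) contrastive loss, shown for comparison."""
    pos = math.exp(pos_sim / temperature)
    neg = sum(math.exp(s / temperature) for s in neg_sims)
    return -math.log(pos / (pos + neg))

# Hypothetical similarities for a single anchor.
negatives = [0.7, 0.1, -0.2, 0.5]
standard = info_nce_loss(0.8, negatives, temperature=0.5)
debiased = debiased_contrastive_loss(0.8, negatives, tau_plus=0.1, temperature=0.5)
print(f"standard={standard:.4f} debiased={debiased:.4f}")
```

With `tau_plus = 0` the correction vanishes and the sketch reduces to the standard loss. A positive `tau_plus` shrinks the negative term, so likely false negatives are penalized less.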


Improved proteasomal cleavage prediction with positive-unlabeled learning

Emilio Dorigatti , Bernd Bischl , Benjamin Schubert

Categories: Machine Learning

2022-09-14

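Accurate in silico modeling of the antigen processing pathway is crucial to enable personalized epitope vaccine design. An important step of this pathway is the degradation of the vaccine into smaller peptides by the proteasome, some of which are then presented to T cells by the MHC complex. While predicting MHC-peptide presentation has received a lot of attention recently, proteasomal cleavage prediction remains a relatively unexplored area, even in light of recent advances in high-throughput mass-spectrometry-based MHC ligandomics. Moreover, because such experimental techniques cannot identify regions that cannot be cleaved, the latest predictors generate synthetic negative samples and treat them as true negatives during training, even though some of them could actually be positives. In this work, we therefore present a new predictor trained with an expanded dataset and the solid theoretical underpinning of positive-unlabeled learning, achieving a new state of the art in proteasomal cleavage prediction. The improved predictive capabilities will in turn enable more precise vaccine development, improving the efficacy of epitope-based vaccines. Code and pretrained models are available at https://github.com/schubertlab/proteasomal-cleavage-puupl.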


Bayesian learning of feature spaces for multitasks problems

Carlos Sevilla-Salcedo , Ascensión Gallardo-Antolín , Vanessa Gómez-Verdejo , Emilio Parrado-Hernández

Categories: Machine Learning (Statistics) | Machine Learning

2022-09-07

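This paper presents a Bayesian framework for constructing non-linear, parsimonious, shallow models for multitask regression. The proposed framework relies on the fact that Random Fourier Features (RFFs) allow an RBF kernel to be approximated by an extreme learning machine whose hidden layer is formed by RFFs. The main idea is to combine both dual views of the same model under a single Bayesian formulation that extends Sparse Bayesian Extreme Learning Machines to multitask problems. From the kernel methods point of view, the proposed formulation facilitates the introduction of prior domain knowledge through the RBF kernel parameter. From the extreme learning machine perspective, the new formulation helps control overfitting and enables a parsimonious overall model (the models that serve each task share the same set of RFFs, selected within the joint Bayesian optimization). The experimental results show that combining the advantages of kernel methods and extreme learning machines within the same framework can lead to significant improvements over the performance achieved by each of these two paradigms independently.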
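The RFF fact that this abstract builds on can be checked numerically. The sketch below uses the standard Random Fourier Features construction in plain Python and shows only the kernel approximation, not the Bayesian multitask model of the paper. The function names, the kernel width `GAMMA`, the feature count, and the sample points are assumptions chosen for illustration.

```python
import math
import random

def rbf_kernel(x, y, gamma):
    """Exact RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def sample_rff_params(dim, n_features, gamma, rng):
    """Draw frequencies w ~ N(0, 2*gamma*I) and phases b ~ U(0, 2*pi)."""
    std = math.sqrt(2.0 * gamma)
    weights = [[rng.gauss(0.0, std) for _ in range(dim)] for _ in range(n_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    return weights, phases

def rff_map(x, weights, phases):
    """Hidden-layer activations z(x) = sqrt(2/D) * cos(W x + b)."""
    scale = math.sqrt(2.0 / len(weights))
    return [
        scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
        for w, b in zip(weights, phases)
    ]

def rff_kernel(x, y, weights, phases):
    """Approximate kernel value as the inner product z(x) . z(y)."""
    zx = rff_map(x, weights, phases)
    zy = rff_map(y, weights, phases)
    return sum(a * b for a, b in zip(zx, zy))

rng = random.Random(0)  # fixed seed so the run is reproducible
GAMMA = 0.5
weights, phases = sample_rff_params(dim=3, n_features=5000, gamma=GAMMA, rng=rng)
x = [0.2, -0.1, 0.4]
y = [0.0, 0.3, 0.1]
exact = rbf_kernel(x, y, GAMMA)
approx = rff_kernel(x, y, weights, phases)
print(f"exact={exact:.4f} approx={approx:.4f}")
```

The random cosine layer plays the role of the hidden layer of an extreme learning machine. A linear model on `rff_map(x, ...)` therefore behaves approximately like a kernel machine with the RBF kernel, and the approximation error shrinks as the number of features grows.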


Robust and Efficient Imbalanced Positive-Unlabeled Learning with Self-supervision

Emilio Dorigatti , Jonas Schweisthal , Bernd Bischl , Mina Rezaei

Categories: Machine Learning

2022-09-06

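Learning from positive and unlabeled (PU) data is a setting in which the learner has access only to positive and unlabeled samples and has no information about negative examples. This PU setting is of great importance in various tasks, such as medical diagnosis, social network analysis, financial market analysis, and knowledge base completion, which also tend to be intrinsically imbalanced, i.e., most examples are actually negatives. However, most existing approaches to PU learning consider only artificially balanced datasets, and it is unclear how well they perform in the realistic scenario of imbalanced and long-tailed data distributions. This paper proposes to tackle this challenge via robust and efficient self-supervised pretraining. However, training conventional self-supervised learning methods on highly imbalanced PU distributions requires a better reformulation. In this paper, we present ImPULSeS, a unified representation learning framework for Imbalanced Positive Unlabeled Learning leveraging Self-Supervised debiased pre-training. ImPULSeS uses a generic combination of large-scale unsupervised learning with a contrastive loss and an additional reweighted PU loss. We performed different experiments across multiple datasets to show that ImPULSeS is able to halve the error rate of the previous state of the art, even compared with previous methods that are given the true prior. Moreover, our method showed increased robustness to prior misspecification and superior performance even when pretraining was performed on an unrelated dataset. We anticipate that this robustness and efficiency will make it much easier for practitioners to obtain excellent results on other PU datasets of interest. The source code is available at https://github.com/jschweisthal/impulses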
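For readers new to the PU setting, the sketch below shows how a classification risk can be estimated from positive and unlabeled scores alone. It uses the widely known non-negative PU risk estimator with a sigmoid surrogate loss as a stand-in. It is not the reweighted PU loss of ImPULSeS, whose exact form the abstract does not give. The function names, the scores, and the class prior are hypothetical.

```python
import math

def sigmoid_loss(score, label):
    """Sigmoid surrogate loss: close to 0 when sign(score) agrees with label (+1 or -1)."""
    return 1.0 / (1.0 + math.exp(label * score))

def mean(values):
    return sum(values) / len(values)

def nn_pu_risk(pos_scores, unl_scores, prior):
    """Non-negative PU risk estimate (illustrative sketch).

    pos_scores: classifier scores on labeled positive samples
    unl_scores: classifier scores on unlabeled samples
    prior:      assumed fraction of positives among the unlabeled data
    """
    risk_pos_as_pos = mean([sigmoid_loss(s, +1) for s in pos_scores])
    risk_pos_as_neg = mean([sigmoid_loss(s, -1) for s in pos_scores])
    risk_unl_as_neg = mean([sigmoid_loss(s, -1) for s in unl_scores])
    # Unlabeled data mixes positives and negatives, so the positive share
    # is subtracted to estimate the risk on the (unseen) negatives.
    neg_risk = risk_unl_as_neg - prior * risk_pos_as_neg
    # Clamp at zero: a negative estimate signals overfitting, not real progress.
    return prior * risk_pos_as_pos + max(0.0, neg_risk)

# Hypothetical imbalanced example: 1 hidden positive among 10 unlabeled samples.
pos_scores = [3.0, 2.5, 4.0]
unl_scores = [-3.0, -2.0, -4.0, 3.0, -3.5, -2.5, -3.0, -4.0, -2.0, -3.0]
good = nn_pu_risk(pos_scores, unl_scores, prior=0.1)
bad = nn_pu_risk([-s for s in pos_scores], [-s for s in unl_scores], prior=0.1)
print(f"risk of sensible scorer={good:.4f}, risk of inverted scorer={bad:.4f}")
```

The estimate depends on `prior`, which is usually unknown in practice. This dependence is why robustness to prior misspecification, as reported in the abstract, matters for PU methods.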


A Spanish dataset for Targeted Sentiment Analysis of political headlines

Tomás Alves Salgueiro , Emilio Recart Zapata , Damián Furman , Juan Manuel Pérez , Pablo Nicolás Fernández Larrosa

Categories: Natural Language Processing

2022-08-30

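Several works have studied subjective texts, as they can induce certain behaviours in their readers. Most work focuses on user-generated texts in social networks, but other kinds of text also contain opinions on certain topics and could influence judgement criteria during political decisions. In this work, we address the task of Targeted Sentiment Analysis for the domain of news headlines published by the main outlets during the 2019 Argentinean presidential elections. For this purpose, we present a polarity dataset of 1,976 headlines mentioning candidates in the 2019 elections, annotated at the target level. Preliminary experiments with state-of-the-art classification algorithms based on pre-trained language models suggest that target information is helpful for this task. We make our data and pre-trained models publicly available.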


Human Decision Makings on Curriculum Reinforcement Learning with Difficulty Adjustment

Yilei Zeng , Jiali Duan , Yang Li , Emilio Ferrara , Lerrel Pinto , C.-C. Jay Kuo , Stefanos Nikolaidis

Categories: Artificial Intelligence | Machine Learning

2022-08-04

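Human-centered AI considers human experiences with AI performance. While abundant research has helped AI achieve superhuman performance through either fully automatic or weakly supervised learning, fewer efforts have explored how AI can tailor itself to a human's preferred skill level. In this work, we guide curriculum reinforcement learning results towards a preferred performance level that is neither too hard nor too easy by learning from the human decision process. To achieve this, we developed a portable, interactive platform that enables users to interact with agents online by manipulating the task difficulty, observing performance, and providing curriculum feedback. Our system is highly parallelizable, making it possible for a human to train large-scale reinforcement learning applications that require millions of samples without a server. The results demonstrate the effectiveness of an interactive curriculum for reinforcement learning involving a human in the loop. They show that reinforcement learning performance can successfully adjust in sync with the difficulty level the human desires. We believe this research will open new doors for achieving flow and personalized adaptive difficulty.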
